Base Development 7.2.1 Release Notes
Document Number 008-1782-030
3. New Features of This Release
The features in this chapter are new or
significantly changed in the Base Compiler
Development software since the MIPSpro 7.1
release. Other older features of note are also
discussed.
3.1 New Man Pages for MIPSpro 7.2.1
The opt(5), lno(5) and o32(5) man pages now
provide information about their specific
options. In the past, this information was
bundled in the cc(1), CC(1), f77(1) and f90(1)
man pages. Also included in MIPSpro 7.2.1 are
the omp_lock(3), omp_nested(3) and
omp_threads(3) man pages which are useful when
doing development for multiprocessors.
3.2 New Automatic Parallelization Option
The 7.2 release of the MIPSpro compilers marked
a major revision of the auto-parallelizer. The
new product incorporates automatic
parallelization into the other optimizations
performed by the MIPSpro compilers. Previous
versions relied on preprocessors to provide
source-to-source conversions prior to
compilation. This change provides several
benefits to developers:
+o Automatic parallelization is integrated with
   optimizations for single processors
+o A set of options and pragmas consistent with
   the rest of the MIPSpro compilers
+o Better run-time and compile-time performance
For more information, please refer to the
auto_p(5) man pages.
NOTE: In order to run the new automatic
parallelization, you must purchase the MIPSpro
Auto Parallelization Option (SC4-APO-7.2) and
install the license for it (FEATURE name string
= auto_pp).
3.3 Compiler System Changes
This section lists changes and additions to
compilers and development tools since the
MIPSpro 7.1 release.
3.3.1 New Options and Defaults in MIPSpro 7.2.1
The following new options, which control the
inlining of memory intrinsics, have been added
to the -OPT option group:
-OPT:....
mem_intrinsics[=(OFF|ON)]
     Enable inlining of memory intrinsics
     (memcpy, memmove, memset, bcopy, bzero,
     blkclr) in some cases. This option has
     an effect only if the corresponding
     procedure has a "#pragma intrinsic"
     for it. The standard include files
     (string.h, memory.h, bstring.h, strings.h)
     contain this pragma for these routines.
     Note that the pragmas are disabled by
     default with the -ansi option. The option
     -D__INLINE_INTRINSICS can be used to
     enable intrinsics in the -ansi mode.
     (default OFF)
memcpy_cannot_overlap[=(OFF|ON)]
     The compiler assumes by default that
     the operands of the "memcpy" routine
     can overlap. This option allows the
     compiler to assume that the operands do
     not overlap and can thus generate
     better code. (default OFF)
bcopy_cannot_overlap[=(OFF|ON)]
     The compiler assumes by default that
     the operands of the "bcopy" routine
     can overlap. This option allows the
     compiler to assume that the operands
     do not overlap and can thus generate
     better code. (default OFF)
memmove_cannot_overlap[=(OFF|ON)]
     The compiler assumes by default that the
     operands of the "memmove" routine can
     overlap. This option allows the compiler
     to assume that the operands do not overlap
     and can thus generate better code.
     (default OFF)
memmove_count=n
     Specify the maximum number of instructions
     that will be generated in the inline
     expansion for the memory intrinsics.
     (default 16)
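The overlap assumptions described above matter only for calls whose source and destination regions actually share storage. The following C sketch contrasts the two cases. The function names shift_right and copy_disjoint are hypothetical and exist only for this illustration; the sketch says nothing about the code the compiler generates.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Shift the first n bytes of buf one position to the right.
 * Source and destination overlap, so memmove() is the correct call.
 * A call like this would presumably be unsafe to build with
 * -OPT:memmove_cannot_overlap=ON. */
void shift_right(char *buf, size_t n)
{
    assert(buf != NULL);
    memmove(buf + 1, buf, n);
}

/* Copy between two distinct arrays. The regions cannot overlap, so
 * memcpy() is valid and a no-overlap assumption holds for this call. */
void copy_disjoint(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
}
```

A program that contains only calls of the second kind is a candidate for the *_cannot_overlap options; a program that relies on calls of the first kind is not.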
3.3.2 -OPT:IEEE_comparisons=n option
For MIPSpro 7.2.1, the -OPT:IEEE_comparisons=n
option has been renamed to -OPT:IEEE_NaN_inf=n.
Its definition under the -OPT option control
group is as follows:
IEEE_NaN_inf=n
     IEEE_NaN_inf=ON forces all operations that might have
     IEEE-754 NaN or infinity operands to yield results
     that conform to ANSI/IEEE 754-1985, the IEEE Standard for
     Binary Floating-point Arithmetic, which specifies the
     standard results for NaN and infinity operands. Specify
     ON or OFF as the setting. The default is
     IEEE_NaN_inf=OFF.
     IEEE_NaN_inf=OFF produces non-IEEE results for various
     operations. For example, x=x is treated as TRUE without
     executing a test, and x/x is simplified to 1
     without dividing. Turning this option on may suppress
     many such common optimizations and hurt performance as a
     result.
For more information please consult the new man
page opt(5).
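The two simplifications mentioned above (x=x and x/x) differ from IEEE 754 behavior only for NaN, infinity and zero operands. The following C sketch shows the IEEE-conforming results that code may depend on. The function names are hypothetical, the NAN and INFINITY macros come from the C99 <math.h> and are an assumption about the build environment, and the volatile qualifier is used here only to keep a compiler from folding the expressions at compile time.

```c
#include <assert.h>
#include <math.h>

/* Returns 1 when x is a NaN, relying on the IEEE rule that a NaN
 * never compares equal to itself. If x == x were folded to TRUE,
 * this test would always return 0. */
int is_nan_by_compare(double x)
{
    volatile double v = x;
    return v != v;
}

/* Returns x / x. Under IEEE rules this is 1 for ordinary nonzero
 * values but NaN when x is zero, infinity or NaN. */
double self_quotient(double x)
{
    volatile double v = x;
    return v / v;
}
```

Code that uses tests of this kind to detect NaN values is the sort of program for which IEEE_NaN_inf=ON would be the safer setting.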
3.3.3 New Options and Defaults in MIPSpro 7.2
A new -DEBUG:option control group has been
created to allow insertion of code to assist in
the debugging of programs. For example,
-DEBUG:div_check=N replaces -TENV:check_div=N
and the 7.2 compiler, by default, inserts code
to check for divide by zero (N=1).
NOTE: The default value for -TENV:check_div=N
under MIPSpro 7.1 was N=0 (no checks).
For more information, please refer to the cc(1)
and DEBUG_group(5) man pages.
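As a rough source-level analogy for a divide-by-zero check, the following C sketch tests the divisor before the division executes. The helper name checked_div and its return convention are hypothetical; these notes do not describe how the compiler-inserted check reports the error, so the sketch should not be read as the generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: divide num by den and store the result in
 * *quotient. Returns 1 on success, or 0 without dividing when the
 * divisor is zero, leaving *quotient unchanged. */
int checked_div(int num, int den, int *quotient)
{
    assert(quotient != NULL);
    if (den == 0)
        return 0;              /* the check fires before the divide */
    *quotient = num / den;
    return 1;
}
```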
The -LIST: options control group has been
enhanced to create a listing file (.l) that
contains the values of all flags modified,
either directly on the command line or
indirectly as a side effect of other options.
For example:
% cc -n32 -LIST:options=ON foo.c
will create foo.l which contains a listing that
contains the default values of certain options
from the -OPT, -LNO, -TARG and -TENV option
control groups.
The following command:
% cc -n32 -LIST:all_options=ON foo.c
will create foo.l which contains a listing that
contains the default values of all options from
all of the option control groups.
For more information, please refer to the cc(1)
man page.
3.3.4 Obsolete Options
Several compile-time flags are now obsolete.
These include -TENV:misalignment=N,
-TENV:align_extern=N and -TENV:aligned=TRUE.
Their use generates a warning message in both
the compiler front end and back end. For
example:
% cc -n32 -TENV:misalignment=3 reshape.c
Warning: Obsolete option "-TENV:misalignment=3" -- ignored
Warning: Obsolete option "-TENV:misalignment=3" -- ignored
The -TENV:varargs_prototypes=TRUE flag has been
replaced by -DEBUG:varargs_prototypes=TRUE.
For more information, please refer to the cc(1)
and DEBUG_group(5) man pages.
3.3.5 Compiler Defaults
When invoking the compiler, -32 -mips2 is
assumed on all machines except those based on
the R8000 processor. There, compilations
default to -64 -mips4. These defaults can be
overridden on the command line or through the
SGI_ABI environment variable. For more
information on these flags, please refer to the
cc(1), f77(1) and abi(5) man pages.
The MIPSpro 7.1 compiler introduced a new method
by which the user can customize the Application
Binary Interface (ABI), instruction set
architecture (ISA) and processor type used in
compilations where they are not explicitly
specified. Under this method, the
COMPILER_DEFAULTS_PATH environment variable can
be set to a colon-separated list of paths where
the compiler will look for a compiler.defaults
file. If no compiler.defaults file is found or
if the environment variable is not set, the
compiler looks for /etc/compiler.defaults. If
that file is not found either, the compiler
resorts to the built-in defaults described in
the cc(1) man page and above. For a description
of the specification format of this file, please
refer to the cc(1) man page.
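The lookup order described above can be sketched in C as follows. This is only an illustration of the search order, not the compiler's implementation. The function name defaults_candidate is hypothetical, and the sketch assumes that each entry in COMPILER_DEFAULTS_PATH names a directory and that empty entries are skipped; neither detail is stated in these notes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the index-th (0-based) file name that
 * would be tried, in order, into out. Entries of path_list come
 * first, followed by /etc/compiler.defaults. path_list may be NULL
 * when the environment variable is not set. Returns 1 on success,
 * 0 when there are no more candidates or out is too small. */
int defaults_candidate(const char *path_list, int index,
                       char *out, size_t out_size)
{
    const char *p = path_list ? path_list : "";
    int seen = 0;
    int n;

    assert(out != NULL && out_size > 0);
    while (*p != '\0') {
        size_t len = strcspn(p, ":");   /* length of this entry */
        if (len > 0) {                  /* assumption: skip empty entries */
            if (seen == index) {
                n = snprintf(out, out_size, "%.*s/compiler.defaults",
                             (int)len, p);
                return n > 0 && (size_t)n < out_size;
            }
            seen++;
        }
        p += len;
        if (*p == ':')
            p++;
    }
    if (seen == index) {                /* fallback after the list */
        n = snprintf(out, out_size, "/etc/compiler.defaults");
        return n > 0 && (size_t)n < out_size;
    }
    return 0;                           /* built-in defaults apply */
}
```

A caller would pass getenv("COMPILER_DEFAULTS_PATH") as path_list and try each candidate in turn until one can be opened; when the function returns 0, the built-in defaults apply.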
3.3.6 WHIRL Intermediate Object File Format Changes
The format of WHIRL Intermediate Object files
has changed. If you have WHIRL intermediate
(.o) files left over from compilations using
MIPSpro 7.1 with interprocedural optimization
enabled (i.e. -IPA), you must recompile the
entire set of files. WHIRL Intermediate Object
files are compatible between MIPSpro 7.2.1 and
MIPSpro 7.2.
3.3.7 ABI Development
For information about ABI development issues,
see the man pages abicc(1), abild(1),
check_abi_compliance, check_abi_interface and
check_for_syscalls.
3.3.8 Controlling CG compiler optimizations
CG is the code-generation part of the compiler.
There are choices to be made in many parts of
CG, e.g. which conditional constructs should be
if-converted, or how much a loop should be
unrolled. In most cases the compiler should
make reasonable decisions, but there are still
times when performance can be improved by
modifying the default behavior.
The following items describe a few of the
ways that CG can be controlled by the user.
+o Non-loop if-conversion can be turned off
   with -CG:ifc_non_loop=off. Currently
   non-loop if-conversion only applies to
   simple if-then or if-then-else constructs
   with very few conditionally executed
   instructions, so it should usually be
   advantageous to do the if-conversion. (In
   fact we may increase the amount of this
   kind of if-conversion we do in future
   releases.) One reason the if-conversion
   could be sub-optimal is that one of the two
   paths through the code might be rarely
   executed. (This can be controlled with
   -CG:body_freq_fb=n. If some block in the
   loop has frequency less than 1/n times the
   frequency of the loop head, if-conversion
   for that loop is disabled.)
+o If-conversion of innermost loops is
   disabled with -CG:if_conversion=off. The
   most likely reason that this would be
   useful is that if-conversion has increased
   the number of instructions in the loop by a
   lot, and the loop is software pipelined, so
   that there is no opportunity for
   reverse_if_conversion to undo the damage.
   Another way to protect against this
   possibility is by setting the value of
   -CG:body_ifc_ratio. For example, if
   -CG:body_ifc_ratio=2, and the number of
   instructions in the loop grows by more than
   a factor of 2 due to if-conversion, then
   the if-conversion will be undone (and of
   course there will then be no opportunity to
   software pipeline that loop).
+o Cross-iteration optimizations can be
   disabled with -CG:vector_rw_removal=off
   (for read-read or read-write
   optimizations), -CG:vector_ww_removal=off
   (for write-write optimizations), and/or
   -CG:cross_iter_cse_removal=off (for common
   sub-expression elimination). The reason to
   do this is that these optimizations can
   increase register pressure.
+o The unroll amount may be increased or
   decreased. There is a heuristic controlled
   by -OPT:unroll_analysis (on by default)
   which generally tries to minimize
   unrolling, because less unrolling leads to
   smaller code size and faster compilation.
   Usually the only thing that makes it unroll
   too much is its attempt to minimize the
   cost of penalties for taken branches. If
   you set the penalty for such a branch to 0
   (-CG:branch_taken_penalty=0), or increase
   the cost for taken branches that the
   heuristic will tolerate (increase the value
   of -CG:unroll_analysis_threshold from its
   default value of 0.1), you can probably
   avoid having loops unrolled too much. You
   can also change the upper bound for the
   amount of unrolling with -OPT:unroll_times
   (default is 8) or -OPT:unroll_size (the
   number of instructions in the unrolled
   body, current default is 80). In case the
   heuristic is limiting unrolling too much,
   it can be disabled with
   -OPT:unroll_analysis=off.
+o Software pipelining can be disabled with
   -OPT:swp=off. As far as CG is concerned,
   -O3 -OPT:swp=off is the same as -O2.
   However, since LNO does not run at -O2, the
   input to CG can be very different, and the
   available aliasing information can be very
   different.
+o Reverse_if_conversion for non-SWP'd loops
   can be disabled with -CG:reverse_if=off.
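For readers unfamiliar with the term, if-conversion replaces a conditional branch with straight-line code that computes both outcomes and selects one. The following C sketch performs the transformation by hand on a simple if-then construct of the kind described above. The function names are hypothetical, and the mask-based select is only a source-level stand-in: CG works on instructions, and these notes do not say what code it emits.

```c
#include <assert.h>

/* Branching form: the assignment executes on only one path. */
int clamp_branch(int x, int limit)
{
    if (x > limit)
        x = limit;
    return x;
}

/* Hand if-converted form: no branch. The comparison yields 1 or 0,
 * and negating it gives an all-ones or all-zeros mask (assuming a
 * two's complement machine) that selects between the two values. */
int clamp_select(int x, int limit)
{
    int take = -(x > limit);
    return (limit & take) | (x & ~take);
}
```

The converted form always executes every instruction, which is why if-conversion can cost time when one path is rarely taken or when it inflates the size of a loop body.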
3.4 Changes to dbx(1)
+o dbx has been enhanced to allow debugging of
   Fortran 90 allocatable arrays.
+o dbx has been enhanced to allow debugging of
   Fortran 90 assumed shape arrays.
+o dbx has been enhanced to allow debugging of
   C++ programs created with the -gslim
   option. This option limits the amount of
   debugging information generated by the C++
   compiler for class definitions. You should
   consider using this option on large
   applications when you experience bloated
   object files, executables, or DSOs when
   compiling with -g. For more information
   refer to the CC(1) man page.
+o dbx has been enhanced to allow debugging of
   C++ code that contains exception handlers.
   For more information, refer to the DBX
   User's Guide.
+o dbx has been enhanced to support pthreads
   debugging. For more information, refer to
   the DBX User's Guide.
3.5 Changes to the linker ld(1)
This linker provides some new features and
better performance. For more information refer
to the ld(1) man page.
+o As of release 5.0.1, the linker can adjust
   executables to avoid certain problems with
   early versions of the R4000. If the
   -no_jump_at_eop flag is on (it is on by
   default), small amounts of padding are
   added between component objects to avoid
   placing a branch instruction at the end of
   a page. Slightly smaller executables and
   significantly faster executables can result
   from turning this option off (using the
   -allow_jump_at_eop flag). Binaries built
   either way should be compatible across all
   Silicon Graphics systems, but those made
   with -no_jump_at_eop (the default) often
   show performance gains on R4000 systems.
   These flags are irrelevant for programs
   compiled with -mips4 because the R8000 and
   R10000 processors do not have this hardware
   bug and no padding is performed by the
   linker.
   However, early versions of the R5000 may
   have problems if a jump or branch
   instruction occurs at an address 8 bytes
   before the end of an odd-numbered page, and
   if a load or store instruction immediately
   follows the jump or branch instruction. The
   6.2 and later releases of the linker work
   around this problem by padding sections of
   object files that exhibit the
   characteristics described above. This
   occurs by default for object files compiled
   for -mips4. Binaries can be built without
   this fix by using the
   -allow_r5k_jump_at_eop option.
+o The 6.2 release of the linker introduced an
   experimental new feature enabled by the
   -multigot option. If you experience GOT
   overflow problems in building your
   applications, you should try relinking with
   the -multigot option as an alternative to
   recompiling with -xgot.
   NOTE: The 7.2.1 release of the linker
   enables -multigot by default.
+o New options have been added to ld(1) for
   aligning variables in the global
   uninitialized data area (bss). See the
   manual page for ld(1) for options with
   names beginning with -X. These new options
   are unique to IRIX and might change across
   releases.
3.6 Assembler (as(1))
+o As of 6.2, the assembler supports 64-bit
   instructions, and can generate 64-bit ELF
   object files. The COFF format is not
   supported. The 64-bit and N32 objects
   contain DWARF debugging support rather than
   MDEBUG.
+o The calling conventions and register usage
   for 64-bit objects are different from the
   32-bit conventions, so you should become
   familiar with the new conventions. The
   MIPSpro 64-Bit Porting and Transition Guide
   is useful for porting code from 32 to 64
   bits. Also see the standard include files
   <regdef.h> and <asm.h>, which are
   parameterized for 32-bit or 64-bit code.
+o Most of the optimizations, like software
   pipelining and cross-basic-block scheduling,
   have been removed from the assembler; these
   optimizations are now done in the back end,
   and thus only happen for high-level code.
   The assembler still does instruction
   scheduling for the user.
+o There are three new assembler directives
   for the generation of 64-bit PIC
   (Position-Independent Code). These
   directives are ignored if not doing a
   64-bit shared (PIC) compile.
+o .cpsetup reg, reg2/offset, label
   By convention, reg == t9, and the label is
   the procedure entry. The second argument
   can be either another register (for the
   case of a leaf routine with no frame) or a
   stack offset, and is used to store the
   value of $gp. This directive expands into:
        sd      gp, offset(sp)
        lui     gp, %hi(%gp_rel(label))
        daddiu  gp, gp, %lo(%gp_rel(label))
        daddu   gp, gp, reg
+o .cpreturn
   This directive expands into:
        ld      gp, offset(sp)
   where "offset" is the same value used in
   the previous .cpsetup.
+o The .cpsetup/.cpreturn sequence replaces
   the .cpload/.cprestore sequence that is
   used in 32-bit PIC code.
+o The other new directive is
   .cplocal reg1
   It specifies a register (typically not $gp)
   to be used as the context pointer. It has
   effect only within a procedure (i.e., it is
   turned off automatically at the end of each
   procedure).
There are two new directives in the 7.00
-n32/-64 assembler:
+o .dynsym name value
   This specifies the st_other field of the
   symbol, which can be "sto_default",
   "sto_internal", "sto_hidden", or
   "sto_protected".
+o .gpvalue value
   The gp value is used in %gp_rel relocations
   as an offset for the addend. By default
   the value is 0.
Chapter 8 of the MIPSpro Assembly Language
Guide contains descriptions of all of the
directives supported by the assembler.
+o The MIPSpro N32 ABI Handbook and the
   MIPSpro 64-bit Porting and Transition Guide
   contain examples of how to write assembly
   language programs for the N32 and 64-bit
   ABIs respectively.
3.7 Libraries
The following changes to the libraries that are
part of the compiler system were made in the 7.1
release.
3.7.1 Repackaging of N32 Subsystems
The compiler_dev.sw32 subsystems have been
bundled into the compiler_dev.sw subsystems and
are no longer present as independent
subsystems.
3.7.2 Discontinuance of N32 and 64-bit Non-shared Libraries
N32 and 64-bit versions of non-shared libraries
for SPEC (compiler_dev.sw32.speclib and
compiler_dev.sw64.speclib) are no longer being
shipped.
The following changes to the libraries that are
part of the compiler system were made in the 6.2
release.
+o The floating point exception handler
   package (libfpe) has been rewritten and
   released with support for programs compiled
   under -mips3 or -mips4. Refer to the
   fsigfpe(3f) and handle_sigfpes(3c) man
   pages.
+o Fast floating point libraries (libfastm)
   tuned for the R5000, R8000 and R10000,
   respectively, are now available when doing
   compilation for the 64-bit and N32 ABIs.
   New -r5000, -r8000 and -r10000 compiler
   flags are provided which add the paths of
   these libraries to the head of the library
   search path. For more information refer to
   the cc(1) and f77(1) man pages.
3.8 Performance Tools
This section includes changes to pixie(1),
pixstats(1) and prof(1).
+o As of the 7.0 release, pixie(1),
   pixstats(1), and prof(1) are no longer
   supported. Their functionality has been
   integrated into a new product called
   SpeedShop. Interested users are referred to
   SpeedShop's release notes for more
   information.
3.9 Library and System Call Functionality
The following additions and changes were made to
library and system call functionality between
versions 5.3 and 6.2 of the IRIS Development
Option (now being replaced by the IRIX
Development Foundation).
+o The MIPSpro C compiler supports long double
   arithmetic using the ANSI C standard
   syntax. Most of the standard
   transcendental functions in libm and libc
   are supported. See specific man pages for
   names and prototypes. Most of the long
   double routines are named by prefixing the
   letter 'q' to the double precision
   routine's name; for example, qsin is the
   long double version of sin.
   The following long double routines are NOT
   supported in this release: acosh, asinh,
   atanh, cbrt, drand48, drem, erand48, expm1.
   See the man page for math(3M) for details
   regarding long double arithmetic. Note
   that long double operations on this system
   are only supported in "round to nearest"
   rounding mode (the default). The system
   must be in "round to nearest" rounding mode
   when issuing long double arithmetic
   operations or calling any of the long
   double functions, or incorrect answers will
   result.